N-CANDA data integration: anatomy of an asynchronous infrastructure for multi-site, multi-instrument longitudinal data capture

J Am Med Inform Assoc. 2014 Jul-Aug;21(4):758-62. doi: 10.1136/amiajnl-2013-002367. Epub 2013 Dec 2.

Abstract

The data collection infrastructure implemented by the National Consortium on Alcohol and NeuroDevelopment in Adolescence (N-CANDA) comprises several innovative features: (a) secure, asynchronous transfer and persistent storage of collected data via a revision control system; (b) two-stage import into a longitudinal database; and (c) use of a script-controlled web browser for data retrieval from a third-party, web-based neuropsychological test battery. The asynchronous operation of data transmission and import is of particular benefit, as it has allowed the consortium sites to begin data collection before the receiving database infrastructure had been deployed. Records were collected within 86 days of funding, 35 days after finalizing the collected instruments. The final instruments were added to the database import 225 days after instrument selection, with up to 173 records already collected at that time. Thus, the concepts implemented in N-CANDA's data collection system helped reduce project start-up time by several months.
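The decoupling described in the abstract can be illustrated with a minimal Python sketch. It is not the N-CANDA implementation. It assumes a simplified model in which an append-only commit log stands in for the revision control system, stage one copies new commits into a staging area, and stage two moves staged records into the database only once their instrument is supported. All names (`Repository`, `Importer`, `stage_one`, `stage_two`, the site and instrument labels) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """Hypothetical stand-in for a revision control system: an append-only log."""
    commits: list = field(default_factory=list)

    def commit(self, site, instrument, record):
        # Sites can commit at any time, whether or not the database is ready.
        self.commits.append(
            {"site": site, "instrument": instrument, "record": dict(record)}
        )
        return len(self.commits)  # revision number of this commit


@dataclass
class Importer:
    """Hypothetical two-stage importer reading from a Repository."""
    supported: set = field(default_factory=set)    # instruments the database accepts
    staged: list = field(default_factory=list)     # stage-one output
    database: list = field(default_factory=list)   # stage-two output
    last_revision: int = 0                         # last revision already staged

    def stage_one(self, repo):
        # Copy only commits made since the previous run into the staging area.
        new = repo.commits[self.last_revision:]
        self.staged.extend(new)
        self.last_revision = len(repo.commits)
        return len(new)

    def stage_two(self):
        # Import staged records whose instrument is supported; keep the rest pending.
        ready = [r for r in self.staged if r["instrument"] in self.supported]
        self.staged = [r for r in self.staged if r["instrument"] not in self.supported]
        self.database.extend(ready)
        return len(ready)


repo = Repository()
repo.commit("site_a", "survey", {"subject": "S001", "visit": "baseline"})
repo.commit("site_b", "cognitive_test", {"subject": "S002", "visit": "baseline"})

importer = Importer(supported={"survey"})
staged_count = importer.stage_one(repo)       # both commits are staged
first_import = importer.stage_two()           # only the survey record is imported

importer.supported.add("cognitive_test")      # instrument added to the import later
second_import = importer.stage_two()          # the pending record is imported now
```

In this sketch, the record for the not-yet-supported instrument is never lost: it stays in the commit log and the staging area until the import is extended, which mirrors how collection could begin before the receiving database was deployed.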

Keywords: data integration; informatics; longitudinal data collection; revision control system.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adolescent
  • Adolescent Development / drug effects*
  • Alcohol Drinking / adverse effects*
  • Biomedical Research
  • Computer Systems
  • Data Collection / methods*
  • Database Management Systems*
  • Humans
  • Information Storage and Retrieval
  • Longitudinal Studies
  • Web Browser*